ABSTRACT

Project  JP2072  aims  to  enhance  communications  in  the  land  environment,  providing  more 
capacity  for  computer  data  traffic.  One  potential  option  is  an  Internet  Protocol-based  network 
carrying  both  voice  and  data.  An  important  consideration  for  a  converged  network  is  whether 
or  not  acceptable  voice  services  can  be  supported  given  the  additional  data  traffic  load.  This 
report describes a method to estimate the quality of Voice over IP calls in the presence of other
network traffic, based on the ITU E-Model. Required inputs to the E-Model are determined
from a computer simulation of the network. We illustrate the application of the model to a
reference Parakeet network. Two companion reports present corresponding simulations of
ATM-based solutions, which, together with the simulation described in this report, will enable
an  evaluation  of  the  options  proposed  for  project  JP2072. 
Communications System (Land)


Executive  Summary 

There  is  an  increasing  demand  for  computer  data  communications  in  the  land 
environment,  and  the  existing  circuit-switched  network  infrastructure  is  not  well 
suited  for  the  carriage  of  such  traffic.  This  issue  is  to  be  addressed  by  project  JP2072, 
which  will  provide  enhanced  communications  for  land  forces.  This  is  likely  to  result  in 
a  move  to  a  converged  (voice  and  data),  packet-based  network  infrastructure,  possibly 
based  on  Voice  over  Internet  Protocol  (VoIP)  or  Asynchronous  Transfer  Mode  (ATM) 
technology.  It  is  expected  that  such  solutions  would  provide  additional  bandwidth  for 
data traffic. However, it is crucial that an acceptable quality of voice services is
maintained  as  additional  data  is  added  to  the  tactical  network.  Hence  there  is  a  need  to 
estimate  the  impact  a  given  data  traffic  load  will  have  on  the  quality  of  voice  calls 
carried  by  the  network,  for  a  particular  technological  solution.  This  report  describes  a 
method  for  making  such  an  assessment  when  voice  services  are  implemented  using 
VoIP  on  a  purely  IP-based  network.  We  illustrate  the  method  by  applying  it  to  a 
reference  Parakeet  network. 

Our  approach  is  based  on  the  International  Telecommunication  Union  (ITU)  E-Model  - 
a  standard  analytic  method  that  allows  the  quality  of  a  voice  call  as  perceived  by  the 
caller  (and  rated  on  a  scale  of  0  to  100)  to  be  estimated  from  objective  measures  relating 
to  the  network  and  to  the  terminal  equipment  at  each  end.  We  obtain  the  measures 
required  for  input  to  the  E-Model  by  executing  a  computer  simulation  of  the  network, 
using  the  commercial  network  simulator  OPNET.  With  this  software  a  model  can  be 
created  to  represent  any  network  topology,  supporting  any  voice  call  and  application 
data  load.  Our  custom  modifications  to  the  simulator  directly  provide  complete 
statistics  for  the  E-Model  transmission  rating  factor,  for  any  pair  of  participating  nodes. 
The  simulator  can  also  be  used  to  determine  the  impact  of  other  network  characteristics 
on  voice  call  quality,  such  as  the  choice  of  Quality  of  Service  (QoS)  policies. 

Two companion papers describe a corresponding simulation for the evaluation of voice
call  quality  in  ATM-based  networks.  Once  configured  with  realistic  application  data 
profiles,  these  simulations  will  enable  a  thorough  comparison  to  be  made  between  the 
two  technologies,  and  will  facilitate  the  choice  of  suitable  network  parameters  in  each 
case. This analysis will be used in the evaluation of options proposed for project
JP2072.
1. Introduction


Recently there has been considerable interest in migrating military communications
infrastructure from the existing situation, where voice telephony services and
computer  data  services  are  provided  by  separate,  stove-piped  networks,  to  a 
converged  solution  where  both  traffic  types  can  exist  on  a  single  network.  Using  the 
same  network  for  voice  and  data  is  expected  to  make  system  maintenance  easier,  and 
the  use  of  packetised  voice  should  make  additional  bandwidth  available  for  the  ever- 
increasing  demand  for  data  services  (during  the  gaps  between  periods  of  speech).  A 
number  of  technologies  have  been  developed  to  implement  packetised  voice  -  two  of 
the  most  important  being  based  on  Internet  Protocol  (IP)  and  Asynchronous  Transfer 
Mode  (ATM)  -  and  there  is  a  need  to  compare  the  performance  of  proposed  solutions 
for  voice  and  data  integration  under  expected  conditions  of  use  in  military  operations. 
In  general,  the  complexity  of  these  networks  rules  out  an  approach  based  on 
mathematical  analysis  alone  (for  example,  using  queueing  theory).  Useful  performance 
metrics can be measured in laboratory testbeds (such as the Application Performance
Testbed  for  ADF  Communications  -  APTAC  -  used  in  DSTO),  but  such  experiments 
are  usually  restricted  to  small  networks  and  produce  results  that  are  not  readily 
generalised  to  larger  scales.  In  some  cases  the  technology  to  be  assessed  is  not  yet 
available,  or  the  operating  conditions  of  interest  are  not  easily  reproduced  in  the  lab.  In 
such  situations  computer  simulation  of  the  network  can  provide  a  useful  tool  for 
performance analysis. Network simulations can reproduce protocol behaviour and
communications  effects  to  any  required  level  of  detail;  they  can  be  scaled  up  to 
investigate  large  topologies  which,  realistically,  are  not  feasible  to  assemble;  and  they 
can  incorporate  new  component  designs  that  are  not  yet  available  in  real  networks.  In 
this  report  we  describe  the  development  of  a  simulation  model  that  can  be  used  to 
investigate the performance of Voice over IP (VoIP) in military networks. Two
companion  reports  [1, 2]  apply  the  concepts  presented  here  to  address  the  performance 
of  ATM-based  solutions.  A  fourth  paper  in  the  series  compares  the  results  obtained  for 
each  of  the  three  proposed  solutions  [3]. 

Previous simulation studies of VoIP have focused on a number of network-level
performance measures that are known to impact on call quality, namely, the
end-to-end packet delay, delay variation (jitter) and packet loss between the source and
destination  [4,  5,  6].  However,  it  is  difficult  to  make  network  design  decisions  when 
these three measures are considered separately. For example, will voice call quality be
improved  by  decreasing  the  end-to-end  packet  delay,  if  this  is  at  the  expense  of 
increasing  the  packet  loss  ratio?  The  International  Telecommunication  Union  (ITU)  has 
standardised  the  E-Model,  a  framework  that  allows  a  quantitative  estimate  to  be 
derived  for  the  perceived  quality  of  a  voice  call,  based  on  characteristics  of  the  user 
terminals  and  the  connection  between  them.  The  result  of  applying  the  E-Model  is  a 
rating for the call quality on a scale of 0 (worst) to 100 (best). The E-Model enables the
comparison of different network configurations with respect to voice quality by
combining the various network-level performance metrics into a single voice rating
factor. The E-Model has been widely used to predict the performance of existing and
planned networks [7, 8, 9, 10]. However, previous studies have either relied on
measurements  taken  from  existing  networks  to  provide  the  necessary  E-Model  inputs, 
or  else  have  derived  approximations  for  the  inputs  based  on  simplifying  assumptions. 

For this study we have implemented the ITU E-Model in the high-fidelity network
simulator OPNET™. OPNET incorporates a library of very detailed models of
network devices and protocols, including VoIP and other network applications as well
as the Quality of Service (QoS) policies implemented in modern routers. In our
implementation the voice quality rating can be selected as an output statistic to be
collected during simulation execution, in the same way as other pre-defined
application performance metrics (such as web page download time, for example). We
can  thus  use  the  tool  to  specify  the  network  topology,  configure  the  traffic  load  with  a 
high  degree  of  realism,  and  experiment  with  alternative  QoS  configurations  to  optimise 
the  service  provided  by  the  network  to  both  non-voice  traffic  and,  with  our  E-Model 
customisations,  voice  traffic. 

We  have  applied  our  OPNET  implementation  of  the  E-Model  to  a  reference  network 
appearing  in  early  Parakeet  specifications  and  described  by  Blair  and  Jana  [12].  This 
reference  specifies  the  network  topology  and  the  expected  number  of  voice  calls 
between each pair of nodes; both of these aspects are reproduced in our OPNET model.
We  have  also  investigated  the  impact  of  including  non-voice  data  traffic  on  the 
network  in  addition  to  the  voice  traffic  load.  Although  we  do  not  yet  have  an  endorsed 
data  traffic  profile,  we  have  demonstrated  the  way  in  which  such  a  profile  will  be 
implemented  in  OPNET  once  it  is  available,  and  the  type  of  experiments  that  can  be 
conducted with the resulting model.

The  structure  for  the  remainder  of  this  report  is  as  follows.  Section  2  explains  the  ITU 
E-Model,  and  describes  our  implementation  of  the  E-Model  using  OPNET.  In  Section  3 
we  describe  the  configuration  of  the  OPNET  model  of  the  Parakeet  reference  network, 
and present the results of a number of simulation runs under different conditions. In
Section 4 we discuss some of the issues arising in the application of the E-Model.
Appendix  A  includes  more  detailed  results  for  each  of  the  scenarios  simulated. 


2.  Predicting  Voice  Call  Quality 

2.1  The  E-Model 

ITU-T Recommendation G.107 specifies the algorithm for the so-called E-Model, a
computational model that can be used to predict the overall quality of a voice call
given  a  number  of  transmission  parameters  [13].  The  output  of  the  model  is  the  voice 
quality  'transmission  rating  factor',  which  is  interpreted  as  a  measure  of  customer 
satisfaction  on  a  scale  of  0  (worst)  to  100  (best).  The  standard  describes  how  the  rating 
factor may be transformed into other commonly used measures of customer
satisfaction,  such  as  the  Mean  Opinion  Score  (MOS).  It  is  stressed,  however,  that  the 
model  is  intended  to  be  used  only  for  relative  comparisons  of  transmission  conditions, 
rather  than  for  actual  customer  opinion  prediction.  The  algorithm  has  as  inputs  a  large 
number  of  parameters  describing  the  equipment  used  at  each  end  of  the  conversation 
and  the  conditions  for  the  line  connecting  them.  In  effect,  the  E-Model  provides  an 
empirical  fit  to  previously  collected  data  (gained  from  field  surveys  or  laboratory  tests), 
relating  these  transmission  parameters  to  user  satisfaction. 
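
As an illustration of the transformation into a Mean Opinion Score mentioned above, the following minimal sketch applies the conversion formula published in G.107. The function name is illustrative only and is not part of the standard or of the simulator; the result should be read as a relative indicator, in keeping with the caution stated above.

```python
def r_to_mos(r: float) -> float:
    """Map an E-Model rating factor R to an estimated Mean Opinion Score."""
    if r <= 0:
        return 1.0          # lower bound of the MOS scale
    if r >= 100:
        return 4.5          # upper bound reachable from the E-Model
    # Cubic conversion between the two bounds
    return 1.0 + 0.035 * r + 7e-6 * r * (r - 60.0) * (100.0 - r)
```

For example, a rating of R = 80 maps to an estimated MOS of about 4.0.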

De  Vleeschauwer  et  al.  [7]  have  investigated  the  prediction  of  subjective  voice  call 
quality  in  VoIP  networks,  based  on  the  E-model.  In  a  subsequent  paper  [8]  they  have 
used  the  E-model  to  determine  the  maximum  permissible  transmission  delay  for  VoIP 
calls  carried  over  satellite,  for  a  range  of  codecs,  in  order  that  call  quality  is  maintained 
at expected levels. They find that two geostationary satellite hops will result in
unacceptable call quality for any codec, and for some codecs even one geostationary
satellite  hop  is  unacceptable.  Cole  and  Rosenbluth  [9]  have  refined  the  approach, 
describing  a  method  for  monitoring  VoIP  applications  based  upon  a  reduction  of  the  E- 
model  to  transport-level,  measurable  quantities.  In  their  application  the  measurable 
quantities  (eg.  delay  and  packet  loss)  are  sampled  from  a  real  network  in  order  to 
continuously  monitor  the  performance  of  the  network.  We  have  applied  their  method 
to  our  simulated  network,  with  the  measurable  quantities  obtained  as  outputs  of  the 
simulation  execution.  In  this  way  we  can  predict  the  performance  of  planned  networks 
in carrying VoIP calls, as a tool to aid with network design.

The  output  of  the  E-model,  the  Transmission  Rating  Factor,  R,  is  computed  according 
to  the  following  expression: 

$R = R_0 - I_s - I_d - I_e + A$,  (1)

where $R_0$ represents the basic signal-to-noise ratio; $I_s$ is a combination of all
impairments that occur approximately simultaneously with the voice signal; $I_d$
represents the impairments due to delay; $I_e$ represents impairments caused by low
bit-rate codecs; and $A$ allows for compensation of impairment factors when there are other
advantages of access or cost to the user. Following Cole and Rosenbluth [9], we adopt
default values for the first two terms (given in G.107). These terms are determined by
the characteristics of the terminal equipment used to access the packet network, rather
than the characteristics of the network itself. We also drop the expectation factor, $A$,
since it is rather subjective, and in any case would take on the same value for the
different options being compared. We therefore obtain (see footnote 1):

$R = 93.3 - I_d - I_e$.  (2)

The term $I_d$ is a function of two types of variables: those representing delays in the
network,  and  other  variables  (for  example,  representing  the  level  of  echo  suppression). 
The  latter  group  of  variables  are  assigned  default  values.  Three  variables  representing 

1  The  value  93.3  appearing  in  our  expression  differs  slightly  from  the  value  of  94.2  given  by  Cole 
and  Rosenbluth  [9]  due  to  changes  made  in  the  year  2000  revision  of  the  E-model. 
delays are used: 1) $T_a$, the average, absolute one-way mouth-to-ear delay; 2) $T$, the
average, one-way delay from the receive side to the point in the end-to-end path where
a signal coupling occurs as a source of echo; and 3) $T_r$, the average, round-trip delay in
the four-wire loop. For the current VoIP scenario with no circuit-switched components
these delays are related as

$T_a = T = T_r / 2$,  (3)

and $I_d$ is a function, specified in G.107, of the single delay $T$ [9]. This delay consists of
three  components:  1)  the  component  due  to  the  codec,  2)  the  network  delay,  and  3)  the 
delay due to the de-jitter buffer. The codec delay component is determined from the
look-ahead  time,  the  frame  size  and  the  number  of  frames  per  packet.  The  network 
delay  component  will  be  measured  from  the  simulation,  and  includes  delays  due  to 
transmission,  propagation  and  queueing  in  routers.  The  de-jitter  buffer  is  required  to 
smooth  out  the  variations  in  the  inter-arrival  times  of  incoming  packets  in  order  to 
reconstruct  a  synchronous  bit  stream  for  playback  to  the  receiver.  It  therefore  removes 
jitter at the expense of increased delay and packet loss (due to over-run and under-run
of  the  buffer).  Calculation  of  the  delay  and  packet  loss  arising  due  to  the  de-jitter  buffer 
requires  knowledge  of  the  de-jitter  algorithm,  which  is  not  specified  as  part  of  the 
codec  (it  is  usually  proprietary). 

There are no analytic expressions for the final term, the equipment impairment $I_e$; it
must  be  determined  experimentally  for  a  particular  combination  of  codec,  loss 
concealment  algorithm,  packet  loss  distribution,  etc.  Some  measurements  for  common 
codecs  are  provided  in  Appendix  I  of  ITU-T  recommendation  G.113  [14].  Included  are 
impairment  values  for  the  G.729  codec  (Annexe  A,  with  speech  activity  detection  and  2 
frames  per  packet)  for  a  range  of  values  for  the  packet  loss  ratio,  when  the  packet  loss 
is  random.  Impairment  values  provided  for  other  codecs  indicate  that  the  impairment 
is  greater  when  packet  loss  is  correlated,  so  the  values  provided  for  G.729  should  be 
considered  optimistic.  Another  potential  problem  with  using  the  impairment 
measurements  from  G.113  is  that  we  have  no  way  of  extrapolating  these  measurements 
to  the  other  codec  variants  of  interest  (G.729B,  with  or  without  speech  activity 
detection  and  with  a  variable  number  of  frames  per  packet). 

Figure 1 shows the predictions of the E-model as a function of transmission delay for
the G.729A codec, using the equipment impairment measurements in G.113.

Figure 1: E-model predictions for G.729A codec (G.729A + VAD, 2 frames/pkt; horizontal axis: one-way transmission delay in ms)


2.2 Implementation of the E-Model in OPNET

Our OPNET implementation of the E-Model provides additional statistics
(transmission  rating,  delay  impairment  and  equipment  impairment)  that  a  user  of  the 
software  can  select  for  collection  in  the  same  way  as  other  application-level  statistics. 
These  statistics  are  available  under  the  'Voice  Called  Party'  and  'Voice  Calling  Party' 
statistics  groups  for  each  node.  In  the  standard  models  supplied  with  OPNET  only 
'Voice  Called  Party'  statistics  provide  separate  values  for  each  source  node  at  the  other 
end of the conversation; 'Voice Calling Party' statistics are aggregated across all source
nodes  into  a  single  value.  However,  we  are  interested  in  the  voice  quality  for  each  node 
pair  (for  a  given  node,  the  quality  of  the  call  will  be  strongly  dependent  on  which  node 
is  at  the  other  end).  We  therefore  modified  the  OPNET  process  models  implementing 
the voice application in order to obtain separate values for 'Voice Calling Party'
statistics  also. 


Note: we used OPNET Version 8.0.C.

In our modified models an estimate for the transmission rating factor is generated as
each VoIP packet arrives at its destination, in accordance with Equation (2) (these
individual samples would normally be subject to some form of averaging in an actual
simulation execution). For the delay impairment we have used a simple analytic fit for
the function $I_d(T)$ as opposed to the full expression given in G.107 [13]. We used the fit
provided by Cole and Rosenbluth [9]:

$I_d(T) = \begin{cases} 0.024\,T, & T < 177.3\ \text{ms} \\ 0.024\,T + 0.11\,(T - 177.3), & T \ge 177.3\ \text{ms} \end{cases}$  (4)

As  noted  above,  the  delay  T  is  made  up  of  three  components.  The  end-to-end  network 
delay  is  determined  from  the  simulation  by  subtracting  the  creation  time  of  the  packet 
(carried  by  the  packet  as  a  timestamp)  from  the  simulation  time  at  which  it  arrives  at 
its destination. The codec delay is fixed at 45 ms, consisting of the 5 ms look-ahead time
required  for  the  G.729  codec  and  four  frames  of  size  10ms  each.  The  remaining  delay 
component  is  due  to  the  de-jitter  buffer. 
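
The delay calculation described above can be sketched as follows. This is a minimal illustration rather than the code used inside the simulator; the function and constant names are hypothetical, while the numerical values are those quoted in the text and in Equation (4).

```python
LOOK_AHEAD_MS = 5.0       # G.729 look-ahead time
FRAME_MS = 10.0           # G.729 frame size
FRAMES_PER_PACKET = 4     # packetisation used in this study


def codec_delay_ms(look_ahead_ms: float = LOOK_AHEAD_MS,
                   frame_ms: float = FRAME_MS,
                   frames_per_packet: int = FRAMES_PER_PACKET) -> float:
    """Codec delay: look-ahead time plus the time to fill one packet."""
    return look_ahead_ms + frames_per_packet * frame_ms


def delay_impairment(total_delay_ms: float) -> float:
    """Equation (4): piecewise-linear fit to I_d for a one-way delay T in ms."""
    i_d = 0.024 * total_delay_ms
    if total_delay_ms >= 177.3:
        # Extra penalty once the delay exceeds the 177.3 ms knee
        i_d += 0.11 * (total_delay_ms - 177.3)
    return i_d
```

With these values `codec_delay_ms()` returns 45 ms, and a total delay of 45 ms gives a delay impairment of 1.08.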

In order for the receiver (listener) to understand the sender (speaker), the voice packets
must  be  played  out  continuously  at  a  uniform  rate.  However,  the  voice  packets  do  not 
arrive  at  the  receiver  in  such  a  uniform  fashion  given  that  network  delays  are  not 
constant.  The  role  of  a  de-jitter  buffer  is  to  receive  incoming  packets  and  to  store  them 
for  a  short  period  before  playing  them  out,  in  order  to  absorb  the  variations  in  the 
packet delays (and possibly to correct for packets arriving in the wrong order). The
de-jitter buffer therefore removes jitter at the expense of additional delay and packet loss.
To determine the impact on the perceived call quality in the model we need to
implement the de-jitter buffer in OPNET. However, de-jitter buffer implementations
are not standardised in the way that codecs are, and tend to be specific to a particular
VoIP software implementation. Our approach has been to implement an E-policy
de-jitter buffer [15] within the OPNET VoIP models, as a very simple, generic example. If
necessary  this  implementation  could  be  replaced  with  alternatives  in  order  to  assess 
the  impact  of  different  de-jitter  buffer  implementations  on  call  quality. 

The E-policy is very straightforward: the buffer is maintained at a particular size until a
late packet arrives that has missed its play-out slot, at which time the buffer size is
increased to accommodate the late packet. Therefore, there is no packet loss at the
de-jitter buffer induced in this implementation, and the total delay T increases in a
step-wise fashion such that it is always the maximum of the delays experienced by all
packets  up  to  that  point  in  time.  This  total  delay  T  is  used  in  Equation  (4)  to  compute 
the  delay  impairment  factor. 
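
The step-wise behaviour of the E-policy can be illustrated with a running maximum over the per-packet delays. The sketch below is a simplified stand-alone model, not the OPNET process model; the function name and the sample delays are hypothetical.

```python
from typing import Iterable, List


def e_policy_delays(packet_delays_ms: Iterable[float],
                    initial_ms: float = 0.0) -> List[float]:
    """Return the total delay T in force after each packet arrives."""
    playout = initial_ms
    history = []
    for delay in packet_delays_ms:
        if delay > playout:
            # Late packet: grow the buffer so that it can still be played out
            playout = delay
        history.append(playout)
    return history
```

For the delays 50, 48, 60, 55 and 59 ms the function returns 50, 50, 60, 60 and 60 ms: the delay never decreases and no packet is discarded.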

Although no packets are discarded by the de-jitter buffer, there is still some packet loss
in the network. Measured values of the equipment impairment factor for different
values  of  the  packet  loss  ratio,  for  the  G.729A  codec  (with  speech  activity  detection  and 
two  frames  per  packet),  are  provided  in  G.113  [14];  we  use  an  analytic  fit  to  these 
measurements  from  Cole  and  Rosenbluth  [9]: 

$I_e \approx 11 + 40 \ln(1 + 10e)$,  (5)
where e is the packet loss ratio. As each packet arrives at its destination we compute
the packet loss ratio to be used in this expression as a moving average with a 10 second
window, using sequence numbers to keep track of those packets that have not been
received  in  the  window.  The  window  is  offset  from  the  current  time  by  the  size  of  the 
de-jitter  buffer,  to  ensure  that  we  don't  count  packets  as  being  lost  prematurely. 
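
One possible way to realise the windowed loss estimate and Equation (5) is sketched below. It assumes a constant-rate source in which packet n is sent at time n times the packet interval, which is a simplification introduced here for illustration; the function names are hypothetical and the bookkeeping in the simulator may differ.

```python
import math
from typing import Set


def equipment_impairment(loss_ratio: float) -> float:
    """Equation (5): I_e for a packet loss ratio e between 0 and 1."""
    return 11.0 + 40.0 * math.log(1.0 + 10.0 * loss_ratio)


def windowed_loss_ratio(received: Set[int], now_s: float, offset_s: float,
                        interval_s: float, window_s: float = 10.0) -> float:
    """Fraction of packets sent in the window that have not been received.

    Assumption: packet n is sent at time n * interval_s.
    """
    end = max(0.0, now_s - offset_s)        # hold back by the de-jitter buffer size
    start = max(0.0, end - window_s)
    first = math.ceil(start / interval_s)
    last = math.ceil(end / interval_s) - 1  # the window is [start, end)
    expected = last - first + 1
    if expected <= 0:
        return 0.0
    missing = sum(1 for n in range(first, last + 1) if n not in received)
    return missing / expected
```

Shifting the window back by the buffer size means a packet that is merely late, but still in time for play-out, is not counted as lost.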

By  combining  the  utility  of  the  E-Model  in  predicting  voice  call  quality  with  the 
flexibility  and  high-level  of  detail  possible  with  network  simulation,  we  have  a  very 
useful tool for assessing the suitability of VoIP in any conceivable network. The only
potentially  significant  source  of  error  is  the  value  for  the  equipment  impairment, 
having  been  measured  for  a  different  codec  to  the  one  used  in  the  simulation. 


3.  Application  to  the  Parakeet  Reference  Model 

3.1  Configuration  of  the  Parakeet  Reference  Model 

We have applied our modified OPNET voice model to a reference Parakeet network, as
specified by Blair and Jana [12]. The network consists of twenty nodes; in our
simulation,  each  of  these  nodes  is  represented  by  a  subnet  containing  a  Local  Area 
Network  (LAN)  model  connected  to  a  router  model  (Figure  2).  Five  of  the  nodes  form 
the  core  of  the  network  and  the  corresponding  routers  are  connected  by  fibre  optic 
links operating at 2048 kbps, as shown in Figure 3. The other links between routers are
radio links: satellite links operating at 512 kbps and line-of-sight radio relay links, with
between  one  and  three  hops,  operating  at  2048  kbps.  Although  these  could  be 
represented  in  OPNET  using  radio  link  models,  we  have  chosen  to  use  point-to-point 
link  models  instead  since  all  of  these  links  are  operating  in  a  full  duplex  mode  and 
simulation  of  point-to-point  links  is  much  faster  than  that  of  radio  links. 

The  voice  traffic  profile  is  also  specified  for  the  reference  Parakeet  network  [12].  The 
voice  traffic  matrix  gives  the  average  number  of  calls  made  from  any  node  to  any  other 
node  during  a  24-hour  period.  The  expected  number  of  calls  between  two  nodes 
during  the  'busy  hour'  is  one  quarter  of  the  daily  average  (reproduced  in  Appendix  A). 
For each node in the simulation we configure a voice application profile such that the
average  total  number  of  external  calls  made  from  that  node  during  the  busy  hour  is 
consistent  with  the  voice  traffic  matrix  (calls  within  a  node  are  not  simulated).  This 
fixes  the  average  time  between  calls  made  from  a  given  node;  to  ensure  that  these  calls 
go to other nodes in the correct proportions we also specify the voice destination
preferences  for  the  node  to  be  weighted  according  to  the  entries  in  the  voice  traffic 
matrix.  The  durations  of  all  calls  in  the  simulation  are  exponentially  distributed  with  a 
mean  duration  of  3  minutes.  All  calls  use  the  G.729  codec  with  four  frames  (each  of 
duration  10ms)  per  packet  and  with  speech  activity  detection  disabled. 
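
The derivation of a node's voice profile from one row of the traffic matrix can be sketched as follows. The function name and the example call counts are hypothetical and are not taken from the reference network; only the busy-hour factor of one quarter comes from the text.

```python
from typing import Dict, Tuple


def voice_profile(daily_calls_to: Dict[str, float]) -> Tuple[float, Dict[str, float]]:
    """Return (mean seconds between calls in the busy hour, destination weights)."""
    # Busy-hour calls are one quarter of the 24-hour figure
    busy_hour_calls = {dst: n / 4.0 for dst, n in daily_calls_to.items()}
    total = sum(busy_hour_calls.values())
    if total == 0:
        return float("inf"), {}
    mean_gap_s = 3600.0 / total
    weights = {dst: n / total for dst, n in busy_hour_calls.items()}
    return mean_gap_s, weights


# Hypothetical matrix row: 40 calls per day to node_2 and 120 to node_3
gap_s, weights = voice_profile({"node_2": 40, "node_3": 120})
```

For this made-up row the node starts a call every 90 seconds on average during the busy hour, with a quarter of the calls going to node_2 and three quarters to node_3.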

As yet there is no authoritative data traffic profile for the reference Parakeet network.
We  have  conducted  simulations  with  web  browsing  of  images  in  addition  to  the  voice 
traffic  in  order  to  determine  the  capabilities  of  the  simulation  in  regard  to  investigating 
QoS  issues.  We  recognise  that  the  data  traffic  profile  we  have  used  is  probably  not  very 
realistic,  but  it  will  be  straightforward  to  modify  it  in  future  to  a  more  realistic  profile. 
The  same  web  application  profile  was  used  for  all  nodes,  with  10  users  conducting  web 
browsing  in  each  node.  The  profile  attributes  were  chosen  simply  to  have  a  noticeable 
impact  on  the  voice  traffic  without  overloading  the  network.  Page  downloads  occur  on 
average  every  10  seconds  while  the  user  is  browsing  (which  occurs  for  2  minutes  in 
every  5),  and  each  page  contains  an  average  of  7  images,  which  vary  in  size  between  2 
kilobytes  and  10  kilobytes.  The  maximum  IP  packet  size  is  1500  bytes  in  this  model. 
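
A rough estimate of the average image load generated by this profile in each node can be obtained as below. The calculation assumes that image sizes are uniformly distributed between the stated limits and that a kilobyte is 1000 bytes; it ignores protocol headers, the base page object and retransmissions, so it is only an order-of-magnitude guide and not a simulation result.

```python
def mean_web_load_kbps(users: int = 10,
                       active_fraction: float = 2 / 5,
                       page_interval_s: float = 10.0,
                       images_per_page: float = 7.0,
                       min_image_kb: float = 2.0,
                       max_image_kb: float = 10.0) -> float:
    """Average HTTP image load requested by one node, in kilobits per second."""
    mean_image_kb = (min_image_kb + max_image_kb) / 2.0   # uniform sizes assumed
    kb_per_page = images_per_page * mean_image_kb
    pages_per_s = users * active_fraction / page_interval_s
    return pages_per_s * kb_per_page * 8.0                # kilobytes to kilobits
```

Under these assumptions each node requests roughly 134 kbps of image data on average, which can be compared with the 512 kbps and 2048 kbps link rates of the reference network.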

3.2  Results 

Each  set  of  results  described  below  was  generated  by  executing  the  simulation  for  72 
minutes  of  simulation  time.  Traffic  profiles  began  after  2  minutes  (providing  time  for 
the  routing  tables  to  stabilise),  and  result  collection  began  after  a  further  10  minutes 
(i.e.,  a  10  minute  'warm  up'  period,  followed  by  60  minutes  during  which  results  were 
collected).  As  described  in  Section  3.1,  voice  traffic  profiles  were  configured  to 
represent the 'busy hour'.

Our  initial  simulation  executions  were  directed  at  measuring  the  quality  of  voice  calls 
in  the  reference  network  in  the  absence  of  any  non-voice  traffic  (that  is,  the  web 
browsing  traffic  profile  was  disabled).  We  find  that  the  transmission  rating  is  relatively 
constant  for  each  pair  of  nodes  with  the  same  number  of  satellite  hops  (the  average 
Transmission  Rating  Factors  for  calls  between  all  pairs  of  nodes  are  shown  in 
Appendix  A,  Table  4).  With  voice  traffic  only,  delays  in  this  network  are  dominated  by 
the  latency  of  satellite  links  and  the  Transmission  Rating  Factor  depends  only  on  the 
number  of  satellite  hops  in  the  path  between  the  two  nodes  -  taking  the  network  delay 
to be 239 ms x the number of satellite hops gives R = 81.2 (0 hops), R = 63.7 (1 hop), R
= 31.7 (2 hops), as shown in Table 1. In conclusion, for this codec, any satellite link in
the path will result in a call quality of 'low' or 'poor', even without the delays, jitter
and  packet  loss  that  would  result  from  the  introduction  of  non-voice  traffic  to  the 
network. 
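
The ratings quoted above can be reproduced from Equations (2), (4) and (5) with a few lines of arithmetic. The sketch below assumes no packet loss, a 45 ms codec delay, 239 ms of network delay per satellite hop, and negligible de-jitter and terrestrial delays; the names are illustrative only.

```python
import math

CODEC_DELAY_MS = 45.0       # 5 ms look-ahead plus four 10 ms frames
SATELLITE_HOP_MS = 239.0    # network delay attributed to each satellite hop


def transmission_rating(total_delay_ms: float, loss_ratio: float = 0.0) -> float:
    """Equation (2), using the fits of Equations (4) and (5)."""
    i_d = 0.024 * total_delay_ms + 0.11 * max(0.0, total_delay_ms - 177.3)
    i_e = 11.0 + 40.0 * math.log(1.0 + 10.0 * loss_ratio)
    return 93.3 - i_d - i_e


for hops in range(3):
    r = transmission_rating(CODEC_DELAY_MS + hops * SATELLITE_HOP_MS)
    print(f"{hops} satellite hop(s): R = {r:.1f}")
# prints R = 81.2, 63.7 and 31.7 for 0, 1 and 2 hops respectively
```

Even with zero loss the equipment impairment contributes a fixed 11 points, so the best rating available with this codec is about 82 before any delay is taken into account.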


The  voice-only  simulation  was  also  executed  without  the  de-jitter  buffer 
implementation  (that  is,  the  delay  induced  by  the  de-jitter  buffer  was  ignored).  We  find 
that the transmission rating factors were only slightly higher than those produced with
the  de-jitter  buffer  included,  as  shown  in  Table  1.  This  is  a  consequence  of  the  fact  that 
with  only  voice  traffic  the  load  on  the  network  is  relatively  low  so  there  is  no  packet 
loss and the level of jitter is very small, so the E-policy de-jitter buffer is small and
introduces  a  small  additional  delay. 

The  main  motivation  in  developing  a  simulation  model  of  this  network  is  to  investigate 
issues  relating  to  the  integration  of  voice  and  data  traffic  (VoIP  traffic  on  its  own  is 
relatively simple and can be addressed analytically [12]). As discussed in Section 3.1,
we  chose  to  add  a  data  traffic  profile  based  on  web  browsing  of  large  images.  As 
additional  data  traffic  impacts  the  quality  of  voice  services,  and  itself  has  different 
measures  of  service  quality  (such  as  page  download  time),  it  is  necessary  to  configure 
routers  with  appropriate  policies  for  QoS  in  order  to  simultaneously  meet  the 
requirements of all traffic types. However, as a baseline case, we have executed the
simulation  with  the  web  browsing  profile  enabled  and  with  the  default  router  model 
implementing  a  First  In,  First  Out  packet  queue  of  infinite  length,  and  with  no 
differentiation  between  VoIP  and  HTTP  packets.  This  is  to  be  compared  with  the 
results  when  QoS  features  of  routers  are  enabled. 

The  results  for  the  baseline  case  without  QoS,  given  in  Table  1,  show  that  the  call 
quality  is  significantly  decreased  with  the  addition  of  data  traffic,  for  those  nodes 
separated by at least one satellite hop. This is due to the additional delays experienced
by  VoIP  packets  as  they  queue  in  the  routers,  given  the  additional  data  traffic.  The 
router  model  used  in  this  execution  has  an  infinite  packet  buffer,  so  these  delays  can 
grow  to  be  very  large.  In  fact,  the  average  transmission  rating  calculated  for  nodes 
separated  by  two  satellite  hops  is  negative,  indicating  that  the  delays  exceed  the  range 
for  which  the  E-model  can  be  validly  applied.  The  results  also  show  that  the 
Transmission  Rating  Factors  vary  significantly  more  than  when  only  voice  traffic  is 
present  (Table  5  in  Appendix  A  shows  the  average  Transmission  Rating  Factors  for 
between  all  pairs  of  nodes  when  data  traffic  is  included). 
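
To put the ratings in Table 1 on a more familiar scale, the following sketch converts a Transmission Rating Factor R to an estimated Mean Opinion Score using the conversion formula published in ITU-T Recommendation G.107. The simulation results are reported only as R values, so applying this mapping here is an assumption made for illustration, and the function name r_to_mos is hypothetical. Note that the mapping is clamped to the lowest score for R <= 0, so a negative average rating conveys nothing beyond "worst possible quality".

```python
# Transmission Rating Factors copied from Table 1 (0, 1 and 2 satellite hops).
TABLE_1 = {
    "No de-jitter buffer (voice only)": (81.2, 63.7, 31.7),
    "De-jitter buffer included (voice only)": (81.2, 63.1, 30.8),
    "Voice + HTTP traffic (no QoS)": (80.3, 25.1, -21.6),
    "With Priority Queueing": (80.9, 58.9, 24.1),
}


def r_to_mos(r):
    """Map a rating factor R to an estimated MOS (ITU-T G.107 conversion).

    Assumption: this standard mapping is used purely for illustration; it is
    not claimed to be part of the simulation described in the report.
    """
    if r <= 0:
        return 1.0   # clamped: ratings at or below zero give the lowest score
    if r >= 100:
        return 4.5   # clamped: the highest score the mapping can produce
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7.0e-6


if __name__ == "__main__":
    for scenario, ratings in TABLE_1.items():
        print(scenario, [round(r_to_mos(r), 2) for r in ratings])
```

Under this mapping the ratings near 80 for node pairs with no satellite hop correspond to a MOS of roughly 4, while the negative two-hop rating in the baseline case maps to the floor value of 1.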

The results for the baseline case indicate that the additional data traffic seriously degrades the quality of voice services. We wish to determine the improvement that can be gained when the routers are configured with an appropriate QoS policy. The scheme implemented in this model was Priority Queueing, which gives packets with a higher Type of Service (ToS) value (in our case, voice) absolute priority over those with a lower ToS value (e.g. HTTP). This means that each router will deliver HTTP packets only if there are no voice packets waiting in its queue. We therefore expect the Priority Queueing scheme to significantly improve the quality of voice services at the expense of increasing the delays for HTTP services.
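
The behaviour described above can be sketched for a single router output link. The code below is a minimal illustration, not the router model used in the simulation: the function simulate_link, the ToS constants, the 64 kbit/s link rate and the packet sizes are all hypothetical values chosen for the example. It compares a single First In, First Out queue with strict, non-pre-emptive Priority Queueing.

```python
VOICE, HTTP = 1, 0   # hypothetical ToS classes: larger value = higher priority


def simulate_link(arrivals, link_bps, use_priority):
    """Serve packets on one output link with an unbounded buffer.

    arrivals: list of (arrival_time_s, size_bytes, tos) tuples.
    use_priority=False selects the oldest waiting packet (FIFO);
    use_priority=True selects the highest-ToS waiting packet, oldest first.
    Returns a list of (tos, queueing_delay_s), one entry per packet.
    """
    pending = sorted(arrivals)   # ordered by arrival time
    waiting = []                 # packets queued in the router
    results = []
    now = 0.0
    i = 0
    while i < len(pending) or waiting:
        if not waiting and pending[i][0] > now:
            now = pending[i][0]              # link idle: jump to next arrival
        while i < len(pending) and pending[i][0] <= now:
            waiting.append(pending[i])       # admit everything that has arrived
            i += 1
        if use_priority:
            pkt = min(waiting, key=lambda p: (-p[2], p[0]))
        else:
            pkt = min(waiting, key=lambda p: p[0])
        waiting.remove(pkt)
        arrival, size_bytes, tos = pkt
        results.append((tos, now - arrival))
        now += size_bytes * 8 / link_bps     # a started transmission always completes
    return results


def voice_delay(results):
    """Largest queueing delay seen by any voice packet."""
    return max(delay for tos, delay in results if tos == VOICE)


if __name__ == "__main__":
    # Four large HTTP packets arrive just before one small voice packet.
    arrivals = [(0.0, 1500, HTTP), (0.001, 1500, HTTP), (0.002, 1500, HTTP),
                (0.003, 1500, HTTP), (0.01, 60, VOICE)]
    print("FIFO:", voice_delay(simulate_link(arrivals, 64000, False)))
    print("PQ:  ", voice_delay(simulate_link(arrivals, 64000, True)))
```

With these illustrative numbers the voice packet waits about 0.74 s behind all four HTTP packets under FIFO, but only about 0.18 s under Priority Queueing. The remaining delay arises because the HTTP packet already in transmission is not interrupted.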

The results shown in Table 1 illustrate the improvement achieved by implementing Priority Queueing in the routers. The Transmission Rating Factors are much closer to the values obtained when no data traffic is present. They are still lower, however, because the Priority Queueing scheme does not pre-empt packets that are already being transmitted, so voice packets can still experience some queueing delays. The variations in the transmission ratings are also lower, as can be seen in Table 6 of Appendix A.

Table 1: Summary of the average of the Transmission Rating Factors over all node-pairs with 0, 1 and 2 satellite hops, for each of the scenarios investigated.

Scenario                                  0 Satellite Hops   1 Satellite Hop   2 Satellite Hops
No de-jitter buffer (voice only)                81.2               63.7               31.7
De-jitter buffer included (voice only)          81.2               63.1               30.8
Voice + HTTP traffic (no QoS)                   80.3               25.1              -21.6
With Priority Queueing                          80.9               58.9               24.1

The  results  in  Table  2  show  that  the  page  response  time  for  the  web  browsing  traffic  is 
not  significantly  increased  by  the  inclusion  of  Priority  Queueing.  This  result  will, 
however,  be  dependent  on  the  nature  of  the  traffic  profiles  supported  by  the  network. 
A  more  realistic  traffic  profile  may  favour  a  different  QoS  scheme. 

Table 2: The sample mean and standard deviation of the web traffic's page response time with and without PQ.

Page Response Time   Sample mean (sec)   Standard Deviation (sec)
No QoS                      5.2                    1.2
With PQ                     5.3                    1.3

4.  Conclusions 

We have incorporated the ITU E-Model for quantifying voice call quality into network simulations, allowing us to assess the suitability of proposed networks for the transport of integrated voice and non-voice data traffic. The high level of detail in these simulations means we can make credible predictions for measures of service quality for networks still in the design stage.

The E-Model allows voice quality predictions to be made for a wide range of values of some parameters, end-to-end delay being one example. However, the equipment impairment factor has not been reduced to a function of codec attributes and must still be measured in laboratory trials for any combinations of interest. Particular factors that influence the equipment impairment factor include the choice of codec; whether or not speech activity detection is enabled; the number of frames per packet; the packet loss concealment algorithm; and the statistical nature of the packet loss occurring in the network. There is a very large number of combinations of these factors, and equipment impairments have been measured for only a small fraction of them. An alternative to subjective listening tests is the ITU Perceptual Evaluation of Speech Quality (PESQ) algorithm [16], which compares the degraded speech at the destination with the reference speech as input at the source to compute a Mean Opinion Score (MOS) value. Although this allows an assessment of voice quality for codec combinations that have not been tested in laboratory trials, the algorithm is not practical for use in simulated networks. In our case it is necessary to choose the impairment measurement for the codec configuration that most closely matches the case of interest, and hope that it provides a good estimate of the true impairment.

Using the reference Parakeet network we have illustrated how such simulations can be used to investigate different network configurations carrying integrated voice and non-voice data. As an example application we determined the improvement in call quality achieved when Priority Queueing is used in network routers as opposed to a simple First In, First Out queue. We found that in this example the transmission rating factor is determined largely by the number of satellite hops between source and destination, falling sharply with each satellite hop due to the additional propagation delay. Although the E-Model (with its strong dependence on delay) is appropriate for assessing commercial telecommunications networks, where users would certainly be distracted by such delays, it may not be an adequate measure of utility for military users who have different expectations and who can use doctrine to accommodate the delay. It may be possible to develop a militarised version of the E-Model, but this would probably require a substantial effort in laboratory trials. In the meantime, the E-Model provides the only framework available to predict voice call quality from simulation results alone.
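
The size of the propagation delay added by each satellite hop can be estimated with simple arithmetic. The sketch below assumes, for illustration only, geostationary satellites at an altitude of 35,786 km directly above both ground stations; the function name min_hop_delay_ms is hypothetical. Real slant ranges are longer, and processing and queueing delays are excluded, so the figures are lower bounds.

```python
C_KM_PER_S = 299_792.458     # speed of light in vacuum
GEO_ALTITUDE_KM = 35_786.0   # assumption: geostationary orbit, satellite overhead


def min_hop_delay_ms(hops):
    """Lower bound on one-way propagation delay (ms) over `hops` satellite hops.

    Each hop is one uplink plus one downlink, i.e. twice the orbital altitude.
    """
    return hops * 2 * GEO_ALTITUDE_KM / C_KM_PER_S * 1000.0


if __name__ == "__main__":
    for hops in (0, 1, 2):
        print(hops, "hop(s):", round(min_hop_delay_ms(hops), 1), "ms")
```

Under these assumptions each hop adds at least about 239 ms of one-way delay, which is consistent with the strong dependence of the rating on the number of hops.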

Appendix B: Transmission Rating Factor Results

Table 5: Transmission Rating Factor for each node pair, averaged over busy hour (voice and web browsing, with no QoS).

Table 6: Transmission Rating Factor for each node pair, averaged over busy hour (voice and web browsing, with Priority Queueing).